A real estate transaction can be an emotional time for everyone. The complexities between buyers and sellers are the result of different experiences and expectations. Success in today's market is guided by knowledge, communication, and partnership.
Buyers are waiting until later in life to purchase their first home. They have very specific expectations about what they are looking for, and they are willing to take the time to get exactly what they want. To be successful, buyers will turn to experienced professionals to guide them through the buying process and to sift through the large volume of data.
Sellers' past experiences have been rooted in market conditions significantly different from those we are seeing today. Many are resisting the realities of the market and are slow to react to the valuable feedback the data provides. To be successful, sellers will need to rely on skilled professionals to interpret the specifics of today's market and take swift action to adjust for changing trends.
==========================================================================================
Base Real Estate data provided by: Zillow
Base Federal Reserve data provided by: Kaggle
Zillow Data: Timeseries Real Estate data by ZipCode U.S.
Zillow Home Value Index (ZHVI): A smoothed, seasonally adjusted measure of the median estimated home value across a given region and housing type. It is a dollar-denominated alternative to repeat-sales indices.
Datahub.io: U.S., National Yearly Economic Reports
Dataset Info: Economic
Interest Rates: Forecasted for 2016, 2017, 2018
Note: The Kaggle Federal Reserve datasets proved unusable: they were full of gaps, and their time series were too limited to provide value. Economic data was instead pulled from the sources mentioned above and munged together to form a more usable dataset.
Dataset Info: Real Estate
Zillow Home Value Index (ZHVI): A smoothed, seasonally adjusted measure of the median estimated home value across a given region and housing type. It is a dollar-denominated alternative to repeat-sales indices.
Obtain interest rate data from Kaggle
Obtain economic data from datahub.io
Clean the data and perform initial transformation steps
Zillow Single Family Residence DataFrame Head:
A look at the distributions of each dataset's elements to determine the best methods for cleaning the data
This process continues for the remainder of the datasets. See the accompanying notebook for details.
Merged Dataframe of Economic features aggregated from their individual source files
Correlation Heatmap of the new Economic Dataset's features
Time series analysis on real estate median average price by zipcode
Transform Real Estate data for time series analysis
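The reshaping step can be sketched as follows. This is a minimal illustration, assuming the Zillow file is laid out wide (one row per zip code, one column per month). The sample values, the `to_long` helper, and the choice of `RegionName` as the zip code column are assumptions made for the example, not the project's actual code. The output columns are named `ds` and `y` because that is the layout Prophet expects.

```python
import pandas as pd

# Hypothetical miniature of a wide Zillow-style table: one row per zip code,
# one column per month. Names and values are illustrative only.
wide = pd.DataFrame({
    "RegionName": ["60657", "77494"],
    "1997-01": [150000.0, 120000.0],
    "1997-02": [151000.0, 120500.0],
    "1997-03": [152500.0, 121000.0],
})


def to_long(df, id_col="RegionName"):
    """Reshape wide monthly columns into one (zip code, ds, y) row per month."""
    long_df = df.melt(id_vars=[id_col], var_name="ds", value_name="y")
    # Month labels such as "1997-01" become proper timestamps.
    long_df["ds"] = pd.to_datetime(long_df["ds"], format="%Y-%m")
    # Drop months with no recorded value rather than imputing them.
    long_df = long_df.dropna(subset=["y"])
    return long_df.sort_values([id_col, "ds"]).reset_index(drop=True)


long_df = to_long(wide)
```

Filtering `long_df` on a single zip code then yields a two-column series that can be passed to a time series model.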
Note: All time series models were run beforehand on Google Colab and saved as pickle files for continued downstream use
Price Trend from 1997 through 2017 - With a 12 month future prediction...
Price Trend from 1997 through 2018 - With a 5 year future prediction...
Description: Run k-means for three choices for k and choose the best.
A loop of 10 iterations was run over the zip code models generated by the time series process above. Based on the output of the elbow technique, K=4 was the best choice.
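The elbow technique can be sketched as below. This is an illustration on synthetic data, assuming the ten iterations correspond to k = 1 through 10. The four generated blobs stand in for the forecast features, which are not reproduced here.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for the forecast features: four well-separated blobs.
rng = np.random.default_rng(0)
centers = np.array([[0, 0], [10, 0], [0, 10], [10, 10]], dtype=float)
X = np.vstack([c + rng.normal(scale=0.5, size=(50, 2)) for c in centers])

# Fit k-means for k = 1..10 and record the within-cluster sum of squares.
inertias = {}
for k in range(1, 11):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = model.inertia_

# The "elbow" is the k after which inertia stops dropping sharply.
for k, value in inertias.items():
    print(k, round(value, 1))
```

On data like this the inertia falls steeply up to k=4 and flattens afterwards, which is the pattern the elbow technique looks for.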
Intent: Try to use unsupervised learning techniques to classify the time series models produced by Prophet.
Which are the best forecasters?
Load from disk all of the zip code forecast models that were generated in section 2, and prepare them for k-means
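One way to do this loading and preparation is sketched below. The file layout (one pickle per zip code), the helper names `save_forecasts` and `load_feature_matrix`, and the scaling of each series by its first value are all assumptions made for illustration. Plain lists of forecast values stand in for the pickled model objects.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np


def save_forecasts(forecasts, directory):
    """Write one pickle file per zip code (hypothetical layout)."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    for zip_code, values in forecasts.items():
        with open(directory / f"{zip_code}.pkl", "wb") as fh:
            pickle.dump(values, fh)


def load_feature_matrix(directory):
    """Load every pickle and stack the forecasts into a k-means-ready matrix."""
    zips, rows = [], []
    for path in sorted(Path(directory).glob("*.pkl")):
        with open(path, "rb") as fh:
            rows.append(np.asarray(pickle.load(fh), dtype=float))
        zips.append(path.stem)
    X = np.vstack(rows)
    # Assumption: divide each series by its first value so that clusters
    # reflect the shape of the trend rather than the absolute price level.
    return zips, X / X[:, [0]]


# Illustrative forecast values only.
example = {
    "60657": [200000.0, 210000.0, 220000.0],
    "77494": [100000.0, 101000.0, 99000.0],
}
with tempfile.TemporaryDirectory() as tmp:
    save_forecasts(example, tmp)
    zips, X = load_feature_matrix(tmp)
```

Each row of `X` is one zip code's forecast, so the matrix can be handed directly to a clustering routine.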
Resulting cluster classification at K equal to 4
Python package: scikit-learn sklearn.tree.DecisionTreeClassifier
*Build a decision tree model.
Time range: 1997 - 2017 (the cleanest range that could be achieved at this time)
*Train classifiers on Feature 'Price_Point_Class'
--Determine whether classifiers can identify future home value classes based on prior dates, location, and the economic features that have the most impact on both positive and negative price swings...
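A minimal sketch of this setup follows, using synthetic data. The two features (`year`, `rate`), the rule that generates the stand-in `Price_Point_Class` labels, and the 2012 cutoff are invented for illustration. The point is the shape of the experiment: train on earlier years, then score on later ones.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the merged dataset; features and labels are invented.
rng = np.random.default_rng(42)
n = 400
year = rng.integers(1997, 2018, size=n)   # years 1997..2017
rate = rng.uniform(3.0, 8.0, size=n)      # hypothetical interest rate (%)
X = np.column_stack([year, rate])
# Hypothetical Price_Point_Class: 1 when the year is recent and rates are low.
y = ((year > 2007) & (rate < 6.0)).astype(int)

# Train on earlier years and evaluate on later ones.
train = year <= 2012
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X[train], y[train])
accuracy = tree.score(X[~train], y[~train])

# Relative importance of each feature in the fitted tree.
importances = dict(zip(["year", "rate"], tree.feature_importances_))
```

`feature_importances_` gives a first indication of which inputs drive the class splits, which is the question this step asks of the economic features.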
Python Package: scikit-learn v0.21.3 sklearn.ensemble.RandomForestClassifier
A random forest is a meta estimator that fits a number of decision tree classifiers on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting. The sub-sample size is always the same as the original input sample size but the samples are drawn with replacement if bootstrap=True (default).
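The bootstrap behaviour described above can be exercised with a short sketch on synthetic data. The features and labels here are invented. Note that the statement about sub-sample size applies to the v0.21.3 release cited above; releases from 0.22 onward also accept a `max_samples` parameter.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic data: the class depends on the first two of four features.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# bootstrap=True draws each tree's training sample with replacement, so some
# rows are left out of each tree and can serve as a built-in validation set.
forest = RandomForestClassifier(
    n_estimators=100, bootstrap=True, oob_score=True, random_state=0
)
forest.fit(X, y)

oob_accuracy = forest.oob_score_  # accuracy on the out-of-bag rows
n_trees = len(forest.estimators_)
```

The out-of-bag score is a convenient by-product of sampling with replacement: it estimates generalization accuracy without holding out a separate test set.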
Python Package: SciKit-Learn Gaussian Naive Bayes
Build a naïve Bayes model. Tune the parameters, such as the discretization options, to compare results.
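A hedged sketch of the tuning step follows, on synthetic data. `GaussianNB` itself exposes no discretization option; its main tunable is `var_smoothing`, so that is the parameter searched here. The data and the grid of candidate values are assumptions for illustration.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB

# Synthetic two-class data: Gaussian clouds centred at 0 and at 2.
rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(150, 3)),
    rng.normal(loc=2.0, scale=1.0, size=(150, 3)),
])
y = np.array([0] * 150 + [1] * 150)

# Search over var_smoothing, the variance floor added for numerical stability.
param_grid = {"var_smoothing": np.logspace(-9, 0, 10)}
grid = GridSearchCV(GaussianNB(), param_grid, cv=5)
grid.fit(X, y)

best_smoothing = grid.best_params_["var_smoothing"]
best_score = grid.best_score_  # mean cross-validated accuracy
```

Comparing `grid.cv_results_["mean_test_score"]` across the grid shows how sensitive the model is to the setting.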
Real estate housing market trends are impacted by many factors. Gaining insight into this industry requires deep data mining techniques and domain experts who can pull the right data together and engineer it in meaningful ways. Data proved to be the most challenging component of this research. Quality datasets are hard to find, which inhibits possible discoveries.
Certainly economic indicators are present that signal swings in price trends... Further research on comprehensive, state-level economics is needed to expand on the datasets used in this study, which were at the national level. This mismatch in granularity most likely caused the inconsistencies in the models' performance: the real estate data in focus was at the state level, whereas the economic data was a national yearly average.